Abstract. In two-player games on graphs, the players construct an infinite path
through the game graph and get a reward computed by a payoff function over
infinite paths. Over weighted graphs, the typical and most studied payoff functions
compute the limit-average or the discounted sum of the rewards along the path.
Besides their simple definition, these two payoff functions enjoy the property that
memoryless optimal strategies always exist.

In an attempt to construct other simple payoff functions, we define a class of
payoff functions which compute an (infinite) weighted average of the rewards. This
new class contains both the limit-average and discounted sum functions, and we
show that they are the only members of this class which induce memoryless
optimal strategies, showing that there are essentially no other simple payoff functions.



1 Introduction 

Two-player games on graphs have many applications in computer science, such as the
synthesis problem [7], and the model-checking of open reactive systems [1]. Games 
are also fundamental in logics, topology, and automata theory [18, 15, 21]. Games with 
quantitative objectives have been used to design resource-constrained systems [28, 9, 3, 
4], and to support quantitative model-checking and robustness [5, 6, 27]. 

In a two-player game on a graph, a token is moved by the players along the edges
of the graph. The set of states is partitioned into player-1 states from which player 1
moves the token, and player-2 states from which player 2 moves the token. The
interaction of the two players results in a play, an infinite path through the game graph. In
qualitative zero-sum games, each play is winning for one of the players; in quantitative
games, a payoff function assigns a value to every play, which is paid by player 2 to
player 1. Therefore, player 1 tries to maximize the payoff while player 2 tries to
minimize it. Typically, the edges of the graph carry a reward, and the payoff is computed as
a function of the infinite sequence of rewards on the play.

Two payoff functions have received most of the attention in the literature: the
mean-payoff function (for example, see [11, 28, 16, 20, 12, 22]) and the discounted-sum
function (for example, see [25, 12, 23, 24, 9]). The mean-payoff value is the long-run average
of the rewards. The discounted sum is the infinite sum of the rewards under a discount
factor $0 < \lambda < 1$. For an infinite sequence of rewards $w = w_0 w_1 \ldots$, we have:

$$\mathrm{MeanPayoff}(w) = \liminf_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} w_i \qquad \mathrm{DiscSum}_\lambda(w) = (1 - \lambda) \cdot \sum_{i=0}^{\infty} \lambda^i \cdot w_i$$

While these payoff functions have a simple, intuitive, and mathematically elegant
definition, it is natural to ask why they play such a central role in the study of
quantitative games. One answer is perhaps that memoryless optimal strategies exist for
these objectives. A strategy is memoryless if it is independent of the history of the play
and depends only on the current state. Related to this property is the fact that the
problem of deciding the winner in such games is in NP $\cap$ coNP, while no polynomial time
algorithm is known for this problem. The situation is similar to the case of parity games
in the setting of qualitative games where it was proved that the parity objective is the
only prefix-independent objective to admit memoryless winning strategies [8], and the
parity condition is known as a canonical way to express $\omega$-regular languages [26].

In this paper, we prove a similar result in the setting of quantitative games. We
consider a general class of payoff functions which compute an infinite weighted average of
the rewards. The payoff functions are parameterized by an infinite sequence of rational
coefficients $(c_n)_{n \geq 0}$, and defined as follows:

$$\phi(w) = \liminf_{n \to \infty} \frac{\sum_{i=0}^{n} c_i \cdot w_i}{\sum_{i=0}^{n} c_i}$$

We consider this class of functions for its simple and natural definition, and because
it generalizes both mean-payoff and discounted-sum which can be obtained as special
cases, namely for $c_i = 1$ for all $i \geq 0$ $^1$, and $c_i = \lambda^i$ respectively. We study the
problem of characterizing which payoff functions in this class admit memoryless optimal
strategies for both players. Our results are as follows:

1. If the series $\sum_{i=0}^{\infty} c_i$ converges (and is finite), then discounted sum is the only
payoff function that admits memoryless optimal strategies for both players.

2. If the series $\sum_{i=0}^{\infty} c_i$ does not converge, but the sequence $(c_n)_{n \geq 0}$ is bounded, then
for memoryless optimal strategies the payoff function is equivalent to the
mean-payoff function (equivalent for the optimal value and optimal strategies of both
players).

Thus our results show that the discounted sum and mean-payoff functions, besides
their elegant and intuitive definition, are the only members from a large class of natural
payoff functions that are simple (both players have memoryless optimal strategies).
In other words, there are essentially no other simple payoff functions in the class of
weighted infinite average payoff functions. This further establishes the canonicity of
the mean-payoff and discounted-sum functions, and suggests that they should play a
central role in the emerging theory of quantitative automata and languages [10, 17, 2, 5].

In the study of games on graphs, characterizing the classes of payoff functions that
admit memoryless strategies is a research direction that has been investigated in the
works of [13, 14], which give general conditions on the payoff functions such that both
players have memoryless optimal strategies, and [19], which presents similar results
when only one player has memoryless optimal strategies. The conditions given in these
previous works are useful in this paper, in particular the fact that it is sufficient to check
that memoryless strategies are sufficient in one-player games [14]. However, conditions
such as sub-mixing and selectiveness of the payoff function are not immediate to
establish, especially when the sum of the coefficients $(c_n)_{n \geq 0}$ does not converge. We
identify the necessary condition of boundedness of the coefficients $(c_n)_{n \geq 0}$ to derive
the mean-payoff function. Our results show that if the series of coefficients is convergent, then
discounted sum (specified as $(\lambda^n)_{n \geq 0}$, for $\lambda < 1$) is the only memoryless payoff function;
and if the series is divergent and the sequence is bounded, then mean-payoff (specified as
$(\lambda^n)_{n \geq 0}$ with $\lambda = 1$) is the only memoryless payoff function. However we show that if the
series is divergent and the sequence is unbounded, then there exists a sequence $(\lambda^n)_{n \geq 0}$,
with $\lambda > 1$, that does not induce memoryless optimal strategies.

$^1$ Note that other sequences also define the mean-payoff function, such as $c_i = 1 + 1/2^i$.

2 Definitions 

Game graphs. A two-player game graph $G = (Q, E, w)$ consists of a finite set $Q$ of
states partitioned into player-1 states $Q_1$ and player-2 states $Q_2$ (i.e., $Q = Q_1 \cup Q_2$),
and a set $E \subseteq Q \times Q$ of edges such that for all $q \in Q$, there exists (at least one) $q' \in Q$
such that $(q, q') \in E$. The weight function $w : E \to \mathbb{Q}$ assigns a reward to each edge.
For a state $q \in Q$, we write $E(q) = \{r \in Q \mid (q, r) \in E\}$ for the set of successor states
of $q$. A player-1 game is a game graph where $Q_1 = Q$ and $Q_2 = \emptyset$. Player-2 games are
defined analogously.

Plays and strategies. A game on $G$ starting from a state $q_0 \in Q$ is played in rounds
as follows. If the game is in a player-1 state, then player 1 chooses the successor state
from the set of outgoing edges; otherwise the game is in a player-2 state, and player
2 chooses the successor state. The game results in a play from $q_0$, i.e., an infinite path
$\rho = (q_0 q_1 \ldots)$ such that $(q_i, q_{i+1}) \in E$ for all $i \geq 0$. We write $\Omega$ for the set of all plays.
The prefix of length $n$ of $\rho$ is denoted by $\rho(n) = q_0 \ldots q_n$. A strategy for a player is a
recipe that specifies how to extend plays. Formally, a strategy for player 1 is a function
$\sigma : Q^* Q_1 \to Q$ such that $(q, \sigma(\rho \cdot q)) \in E$ for all $\rho \in Q^*$ and $q \in Q_1$. The strategies
for player 2 are defined analogously. We write $\Sigma$ and $\Pi$ for the sets of all strategies for
player 1 and player 2, respectively.

An important special class of strategies are memoryless strategies which do not
depend on the history of a play, but only on the current state. Each memoryless strategy
for player 1 can be specified as a function $\sigma : Q_1 \to Q$ such that $\sigma(q) \in E(q)$ for all
$q \in Q_1$, and analogously for memoryless player-2 strategies.

Given a starting state $q \in Q$, the outcome of strategies $\sigma \in \Sigma$ for player 1 and
$\pi \in \Pi$ for player 2 is the play $\omega(q, \sigma, \pi) = (q_0 q_1 \ldots)$ such that $q_0 = q$ and for all $k \geq 0$,
if $q_k \in Q_1$, then $\sigma(q_0, q_1, \ldots, q_k) = q_{k+1}$, and if $q_k \in Q_2$, then
$\pi(q_0, q_1, \ldots, q_k) = q_{k+1}$.
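The definitions of game graph, memoryless strategy and outcome can be made concrete with a small sketch. The graph, the weights and the names `EDGES`, `WEIGHT`, `OWNER`, `outcome` and `cycle_rewards` below are hypothetical and chosen only for illustration. Because memoryless strategies depend only on the current state, the induced play is ultimately periodic, so it can be returned as a finite prefix followed by a repeated cycle.

```python
# Hypothetical toy game graph (not taken from the paper).
# States 0 and 1 belong to player 1, state 2 belongs to player 2.
EDGES = {0: [1, 2], 1: [0], 2: [0, 2]}
WEIGHT = {(0, 1): 3, (0, 2): 1, (1, 0): 0, (2, 0): 2, (2, 2): 5}
OWNER = {0: 1, 1: 1, 2: 2}


def outcome(start, sigma, pi):
    """Play induced by memoryless strategies, as (prefix, cycle) of states.

    sigma maps player-1 states to successors, pi maps player-2 states to
    successors. The play is prefix followed by cycle repeated forever.
    """
    seen = {}   # state -> position of its first visit
    path = []
    q = start
    while q not in seen:
        seen[q] = len(path)
        path.append(q)
        nxt = sigma[q] if OWNER[q] == 1 else pi[q]
        assert nxt in EDGES[q], "a strategy must choose an existing edge"
        q = nxt
    k = seen[q]
    return path[:k], path[k:]


def cycle_rewards(cycle):
    """Rewards on the edges of a cycle, including the edge closing it."""
    return [WEIGHT[(a, b)] for a, b in zip(cycle, cycle[1:] + cycle[:1])]


prefix, cyc = outcome(0, {0: 2, 1: 0}, {2: 0})
print(prefix, cyc, cycle_rewards(cyc))  # [] [0, 2] [1, 2]
```

The reward sequence of such a play is a regular (eventually periodic) word, which is the only kind of word that memoryless strategies can produce.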

Payoff functions, optimal strategies. The objective of player 1 is to construct a play
that maximizes a payoff function $\phi : \Omega \to \mathbb{R} \cup \{-\infty, +\infty\}$ which is a measurable
function that assigns to every play a real-valued payoff. The value for player 1 is the
maximal payoff that can be achieved against all strategies of the other player. Formally
the value for player 1 for a starting state $q$ is defined as

$$\mathit{val}_1(\phi) = \sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \phi(\omega(q, \sigma, \pi)).$$

A strategy $\sigma^*$ is optimal for player 1 from $q$ if the strategy achieves at least the value of
the game against all strategies for player 2, i.e.,

$$\inf_{\pi \in \Pi} \phi(\omega(q, \sigma^*, \pi)) = \mathit{val}_1(\phi).$$

The values and optimal strategies for player 2 are defined analogously.

The mean-payoff and discounted-sum functions are examples of payoff functions 
that are well studied, probably because they are simple in the sense that they induce 
memoryless optimal strategies and that this property yields conceptually simple fix- 
point algorithms for game solving [25, 11, 28, 12]. In an attempt to construct other sim- 
ple payoff functions, we define the class of weighted average payoffs which compute 
(infinite) weighted averages of the rewards, and we ask which payoff functions in this 
class induce memoryless optimal strategies. 

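We say that a sequence $(c_n)_{n \geq 0}$ of rational numbers has no zero partial sum if
$\sum_{i=0}^{n} c_i \neq 0$ for all $n \geq 0$. Given a sequence $(c_n)_{n \geq 0}$ with no zero partial sum, the
weighted average payoff function for a play $(q_0 q_1 q_2 \ldots)$ is

$$\phi(q_0 q_1 q_2 \ldots) = \liminf_{n \to \infty} \frac{\sum_{i=0}^{n} c_i \cdot w(q_i, q_{i+1})}{\sum_{i=0}^{n} c_i}.$$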

Note that we use $\liminf_{n \to \infty}$ in this definition because the plain limit may not exist
in general. The behavior of the weighted average payoff functions crucially depends on
whether the series $S = \sum_{i=0}^{\infty} c_i$ converges or not. In particular, the plain limit exists
if $S$ converges (and is finite). Accordingly, we consider the cases of converging and
diverging sum of weights to characterize the class of weighted average payoff functions
that admit memoryless optimal strategies for both players. Note that the case where
$c_i = 1$ for all $i \geq 0$ gives the mean-payoff function (and $S$ diverges), and the case
$c_i = \lambda^i$ for $0 < \lambda < 1$ gives the discounted sum with discount factor $\lambda$ (and $S$
converges). All our results hold if we consider $\limsup_{n \to \infty}$ instead of $\liminf_{n \to \infty}$
in the definition of weighted average objectives.
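The two special cases can be checked on finite prefixes. The sketch below is only an illustration: it evaluates the ratio inside the lim inf for the first `n` terms, and the helper names (`weighted_average_prefix`, `constant_one`, `geometric`, `periodic`) as well as the reward word 3, 1, 3, 1, ... are assumptions made for this example.

```python
from fractions import Fraction
from itertools import cycle, islice


def weighted_average_prefix(coeffs, rewards, n):
    """Ratio (sum of c_i * w_i) / (sum of c_i) over the first n terms."""
    cs = list(islice(coeffs, n))
    ws = list(islice(rewards, n))
    return sum(c * w for c, w in zip(cs, ws)) / sum(cs)


def constant_one():
    """Coefficients c_i = 1 (mean-payoff)."""
    while True:
        yield Fraction(1)


def geometric(lam):
    """Coefficients c_i = lam**i (discounted sum)."""
    term = Fraction(1)
    while True:
        yield term
        term *= lam


def periodic(word):
    """Infinite repetition of a finite reward word."""
    return cycle([Fraction(x) for x in word])


mean = weighted_average_prefix(constant_one(), periodic([3, 1]), 1000)
disc = weighted_average_prefix(geometric(Fraction(1, 2)), periodic([3, 1]), 60)
print(mean, disc)  # 2 7/3
```

With $c_i = 1$ the ratio is the plain average of the rewards, and with $c_i = \lambda^i$ it equals the discounted sum normalized by the sum of the discount factors.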

In the sequel, we consider payoff functions $\phi : \mathbb{Q}^\omega \to \mathbb{R}$ with the implicit
assumption that the value of a play $q_0 q_1 q_2 \ldots \in Q^\omega$ according to $\phi$ is $\phi(w(q_0, q_1) w(q_1, q_2) \ldots)$
since the sequence of rewards determines the payoff value.

We recall the following useful necessary condition for memoryless optimal
strategies to exist [14]. A payoff function $\phi$ is monotone if whenever there exists a finite
sequence of rewards $x \in \mathbb{Q}^*$ and two sequences $u, v \in \mathbb{Q}^\omega$ such that $\phi(xu) \leq \phi(xv)$,
then $\phi(yu) \leq \phi(yv)$ for all finite sequences of rewards $y \in \mathbb{Q}^*$.

Lemma 2.1 ([14]). If the payoff function $\phi$ induces memoryless optimal strategies for all
two-player game graphs, then $\phi$ is monotone.

3 Weighted Average with Converging Sum of Weights 

The main result of this section is that for converging sum of weights (i.e., if
$\lim_{n \to \infty} \sum_{i=0}^{n} c_i = c^* \in \mathbb{R}$), the only weighted average payoff function that induces
memoryless optimal strategies is the discounted sum.
Fig. 1. Examples of one-player game graphs.



Theorem 3.1. Let $(c_n)_{n \in \mathbb{N}}$ be a sequence of real numbers with no zero partial sum
such that $\sum_{i=0}^{\infty} c_i = c^* \in \mathbb{R}$. The weighted average payoff function defined by $(c_n)_{n \in \mathbb{N}}$
induces optimal memoryless strategies for all 2-player game graphs if and only if there
exists $0 < \lambda < 1$ such that $c_{i+1} = \lambda \cdot c_i$ for all $i \geq 0$.

To prove Theorem 3.1, we first use its assumptions to obtain necessary conditions
for the weighted average payoff function defined by $(c_n)_{n \in \mathbb{N}}$ to induce optimal
memoryless strategies. By assumptions of Theorem 3.1, we refer to the fact that $(c_n)_{n \in \mathbb{N}}$
is a sequence of real numbers with no zero partial sum such that $\sum_{i=0}^{\infty} c_i = c^* \in \mathbb{R}$,
and that it defines a weighted average payoff function that induces optimal memoryless
strategies for all 2-player game graphs. All lemmas of this section use the
assumptions of Theorem 3.1, but we generally omit mentioning them.

Let $d_n = \sum_{i=0}^{n-1} c_i$, $l = \liminf_{n \to \infty} \frac{1}{d_n}$ and $L = \limsup_{n \to \infty} \frac{1}{d_n}$. The assumption
that $\sum_{i=0}^{\infty} c_i = c^* \in \mathbb{R}$ implies that $l \neq 0$.

Note that $c_0 \neq 0$ since $(c_n)_{n \in \mathbb{N}}$ is a sequence with no zero partial sum. We can
define the sequence $c'_n = c_n / c_0$ which defines the same payoff function $\phi$. Therefore we
assume without loss of generality that $c_0 = 1$.

Lemma 3.1. If the weighted average payoff function defined by $(c_n)_{n \in \mathbb{N}}$ induces
optimal memoryless strategies for all 2-player game graphs, then $0 \leq l \leq L \leq 1$.

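Proof. Consider the one-player game graph $G_1$ shown in Fig. 1. In one-player games,
strategies correspond to paths. The two memoryless strategies give the paths $0^\omega$ and $1^\omega$
with payoff value 0 and 1 respectively. The strategy which takes the edge with reward 1
once, and then always the edge with reward 0 gets payoff $\phi(10^\omega) = \liminf_{n \to \infty} \frac{1}{d_n} = l$.
Since memoryless optimal strategies exist for both players, this payoff must lie between
the payoffs 0 and 1 of the two memoryless strategies, hence $0 \leq l \leq 1$. Similarly, the path
$01^\omega$ has payoff $\liminf_{n \to \infty} \left(1 - \frac{1}{d_n}\right) = 1 - L$, which must also lie between 0 and 1, hence
$0 \leq L \leq 1$. The result follows since $l \leq L$. □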



Lemma 3.2. There exists $w_0 \in \mathbb{N}$ such that $w_0 l > 1$ and the following inequalities
hold, for all $k \geq 0$: $c_k l \leq 1 - d_k L$ and $c_k w_0 l \geq 1 - d_k L$.

Proof. Since $1 \geq l > 0$ (by Lemma 3.1 and $l \neq 0$), we can choose $w_0 \in \mathbb{N}$ such that $w_0 l > 1$.
Consider the game graph $G_2$ shown in Fig. 1 and the case when $w = 1$. The optimal
memoryless strategy is to stay on the starting state forever because $\phi(10^\omega) = l \leq
\phi(1^\omega) = 1$. Using Lemma 2.1, we conclude that since $\phi(10^\omega) \leq \phi(1^\omega)$, we must have
$\phi(0^k 1 0^\omega) \leq \phi(0^k 1^\omega)$, i.e., $c_k l \leq 1 - \left(\sum_{i=0}^{k-1} c_i\right) L$, which implies $c_k l \leq 1 - d_k L$.
Now consider the case when $w = w_0$ in Fig. 1. The optimal memoryless strategy
is to choose the edge with reward $w_0$ from the starting state since $\phi(w_0 0^\omega) = w_0 l >
\phi(1^\omega) = 1$. Using Lemma 2.1, we conclude that since $\phi(w_0 0^\omega) \geq \phi(1^\omega)$, we must have
$\phi(0^k w_0 0^\omega) \geq \phi(0^k 1^\omega)$, i.e., $c_k w_0 l \geq 1 - \left(\sum_{i=0}^{k-1} c_i\right) L$, which implies $c_k w_0 l \geq 1 - d_k L$.
□

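From the inequalities in Lemma 3.2, we get $c_k l \leq 1 - d_k L \leq c_k w_0 l$, i.e.,
$c_k l (w_0 - 1) \geq 0$. Since $w_0 > 1$ and $l > 0$, we must have $c_k \geq 0$ for all $k$.

Corollary 3.1. Assuming $c_0 = 1$, we have $c_k \geq 0$ for all $k \geq 0$.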

It follows from Corollary 3.1 that the sequence $(d_n)_{n \geq 0}$ is increasing and bounded
from above (if $d_n$ was not bounded, then there would exist a subsequence $(d_{n_k})$ which
diverges, implying that the sequence $(1/d_{n_k})$ converges to 0 in contradiction with the
fact that $\liminf_{n \to \infty} \frac{1}{d_n} = l > 0$). Therefore, $d_n$ must converge to some real number,
say $c^* > 0$ (since $c_0 = 1$). We need a last lemma to prove Theorem 3.1. Recall that
we have $c_i \geq 0$ for all $i$ and $\sum_{i=0}^{\infty} c_i = c^* > 0$. Given a finite game graph $G$, let $W$
be the largest reward in absolute value. For any sequence of rewards $(w_n)$ in a run on
$G$, the sequence $\chi_n = \sum_{i=0}^{n-1} c_i (w_i + W)$ is increasing and bounded from above by
$2 \cdot W d_n$ and thus by $2 \cdot W c^*$. Therefore, $\chi_n$ is a convergent sequence and $\sum_{i=0}^{\infty} c_i w_i$
converges as well. Now, we can write the payoff function as $\phi(w_0 w_1 \ldots) = \frac{\sum_{i=0}^{\infty} c_i w_i}{c^*}$.
We decompose $c^*$ into $S_0 = \sum_{i=0}^{\infty} c_{2i}$ and $S_1 = \sum_{i=0}^{\infty} c_{2i+1}$, i.e. $c^* = S_0 + S_1$. Note
that $S_0$ and $S_1$ are well defined.

Lemma 3.3. If there exist numbers $\alpha, \beta, \gamma$ such that $\alpha S_0 + \beta S_1 \leq \gamma (S_0 + S_1)$, then
$(\gamma - \alpha) c_i \geq (\beta - \gamma) c_{i+1}$ for all $i \geq 0$.

Proof. Consider the game graph $G_4$ as shown in Fig. 1. The condition $\alpha S_0 + \beta S_1 \leq
\gamma (S_0 + S_1)$ implies that the optimal memoryless strategy is to always choose the edge
with reward $\gamma$. This means that $\phi(\gamma^i \alpha \beta \gamma^\omega) \leq \phi(\gamma^\omega)$, hence $\alpha c_i + \beta c_{i+1} \leq \gamma (c_i + c_{i+1})$,
i.e. $(\gamma - \alpha) c_i \geq (\beta - \gamma) c_{i+1}$ for all $i \geq 0$. □

We are now ready to prove the main theorem of this section. 

Proof (of Theorem 3.1). First, we show that $S_1 \leq S_0$. By contradiction, assume that
$S_1 > S_0$. Choosing $\alpha = 1$, $\beta = -1$, and $\gamma = 0$ in Lemma 3.3, and since $S_0 - S_1 < 0$,
we get $-c_i \geq -c_{i+1}$ for all $i \geq 0$ which implies $c_n \geq c_0 = 1$ for all $n$, which
contradicts that $\sum_{i=0}^{\infty} c_i$ converges to $c^* \in \mathbb{R}$.

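Now, we have $S_1 \leq S_0$ and let $\lambda = \frac{S_1}{S_0} \leq 1$. Consider a sequence of rational
numbers $\frac{l_n}{k_n}$ converging to $\lambda$ from the right, i.e., $\frac{l_n}{k_n} \geq \lambda$ for all $n$, and $\lim_{n \to \infty} \frac{l_n}{k_n} = \lambda$.
Taking $\alpha = 1$, $\beta = k_n + l_n + 1$, and $\gamma = l_n + 1$ in Lemma 3.3, and since the condition
$S_0 + (k_n + l_n + 1) S_1 \leq (l_n + 1)(S_0 + S_1)$ is equivalent to $k_n S_1 \leq l_n S_0$ which holds
since $\frac{l_n}{k_n} \geq \lambda$, we obtain $l_n c_i \geq k_n c_{i+1}$ for all $n \geq 0$ and all $i \geq 0$, that is $c_{i+1} \leq \frac{l_n}{k_n} c_i$
and in the limit for $n \to \infty$, we get $c_{i+1} \leq \lambda c_i$ for all $i \geq 0$.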

Similarly, consider a sequence of rational numbers $\frac{r_n}{s_n}$ converging to $\lambda$ from the left.
Taking $\alpha = r_n + s_n + 1$, $\beta = 1$, and $\gamma = s_n + 1$ in Lemma 3.3, and since the condition
$(r_n + s_n + 1) S_0 + S_1 \leq (s_n + 1)(S_0 + S_1)$ is equivalent to $r_n S_0 \leq s_n S_1$ which holds
since $\frac{r_n}{s_n} \leq \lambda$, we obtain $r_n c_i \leq s_n c_{i+1}$ for all $n \geq 0$ and all $i \geq 0$, that is $c_{i+1} \geq \frac{r_n}{s_n} c_i$
and in the limit for $n \to \infty$, we get $c_{i+1} \geq \lambda c_i$ for all $i \geq 0$.

The two results imply that $c_{i+1} = \lambda c_i$ for all $i \geq 0$ where $0 \leq \lambda < 1$. Note that
$\lambda \neq 1$ because $\sum_{i=0}^{\infty} c_i$ converges. □

Since it is known that for $c_i = \lambda^i$, the weighted average payoff function induces
memoryless optimal strategies in all two-player games, Theorem 3.1 shows that
discounted sum is the only memoryless payoff function when the sum of weights $\sum_{i=0}^{\infty} c_i$
converges.
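The role of Lemma 3.3 in this argument can be illustrated numerically. The following sketch uses assumed example values: a geometric sequence with ratio 2/3, and a hypothetical finitely supported sequence 1, 1/2, 1/8, 0, 0, .... For the geometric sequence the conclusion of Lemma 3.3 holds whenever its premise does, whereas for the other sequence a triple $(\alpha, \beta, \gamma)$ satisfies the premise but violates the conclusion, so that sequence cannot satisfy the assumptions of Theorem 3.1.

```python
from fractions import Fraction
from itertools import product


def premise(S0, S1, alpha, beta, gamma):
    """Premise of Lemma 3.3: alpha*S0 + beta*S1 <= gamma*(S0 + S1)."""
    return alpha * S0 + beta * S1 <= gamma * (S0 + S1)


def conclusion(coeffs, alpha, beta, gamma):
    """Conclusion of Lemma 3.3, checked on the listed coefficients only."""
    return all(
        (gamma - alpha) * coeffs[i] >= (beta - gamma) * coeffs[i + 1]
        for i in range(len(coeffs) - 1)
    )


# Geometric sequence c_i = lam**i (illustrative ratio, an assumption).
lam = Fraction(2, 3)
geo = [lam ** i for i in range(30)]
geo_S0 = 1 / (1 - lam ** 2)    # closed form of the sum over even indices
geo_S1 = lam / (1 - lam ** 2)  # closed form of the sum over odd indices

geometric_ok = all(
    conclusion(geo, a, b, g)
    for a, b, g in product(range(-3, 4), repeat=3)
    if premise(geo_S0, geo_S1, a, b, g)
)

# Hypothetical non-geometric sequence 1, 1/2, 1/8, 0, 0, ...
other = [Fraction(1), Fraction(1, 2), Fraction(1, 8), Fraction(0)]
other_S0 = other[0] + other[2]
other_S1 = other[1] + other[3]
witness = (-4, 9, 0)  # (alpha, beta, gamma)

print(geo_S1 / geo_S0 == lam, geometric_ok)              # True True
print(premise(other_S0, other_S1, *witness),
      conclusion(other, *witness))                        # True False
```

The check over integer triples in a small range is only a sanity test of the inequality, not a proof of the theorem.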

4 Weighted Average with Diverging Sum of Weights 

In this section we consider weighted average objectives such that the sum of the weights
$\sum_{i=0}^{\infty} c_i$ is divergent. We first consider the case when the sequence $(c_n)_{n \in \mathbb{N}}$ is bounded
and show that the mean-payoff function is the only memoryless one.

4.1 Bounded sequence 

We are interested in characterizing the class of weighted average objectives that are
memoryless, under the assumption that the sequence $(c_n)$ is bounded, i.e., there exists a
constant $c$ such that $|c_n| \leq c$ for all $n$. The boundedness assumption is satisfied by
the important special case of regular sequences of weights which can be produced by a
deterministic finite automaton. We say that a sequence $(c_n)$ is regular if it is eventually
periodic, i.e. there exist $n_0 \geq 0$ and $p > 0$ such that $c_{n+p} = c_n$ for all $n \geq n_0$. Recall
that we assume the partial sum to be always non-zero, i.e., $d_n = \sum_{i=0}^{n-1} c_i \neq 0$ for all
$n \geq 1$. We show the following result.

Theorem 4.1. Let $(c_n)_{n \in \mathbb{N}}$ be a sequence of real numbers with no zero partial sum
such that $\sum_{i=0}^{\infty} |c_i| = \infty$ (the sum is divergent) and there exists a constant $c$ such that
$|c_i| \leq c$ for all $i \geq 0$ (the sequence is bounded). The weighted average payoff function $\phi$
defined by $(c_n)_{n \in \mathbb{N}}$ induces optimal memoryless strategies for all 2-player game graphs
if and only if $\phi$ coincides with the mean-payoff function over regular words.

Remark. From Theorem 4.1, it follows that all weighted average payoff functions $\phi$ over bounded
sequences that induce optimal memoryless strategies are equivalent to the mean-payoff
function, in the sense that the optimal value and optimal strategies for $\phi$ are the same as
for the mean-payoff function. This is because memoryless strategies induce a play that
is a regular word. We also point out that it is not necessary that the sequence $(c_n)_{n \geq 0}$
consists of a constant value to define the mean-payoff function. For example, the payoff
function defined by the sequence $c_n = 1 + 1/(n+1)^2$ also defines the mean-payoff
function.
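The last claim of the remark can be checked numerically on a regular word. The sketch below assumes the coefficient sequence $c_n = 1 + 1/(n+1)^2$ (the exponent 2 is an assumption of this example) and a hypothetical periodic reward word 4, 0, 2 with mean 2. It evaluates a single long prefix ratio in floating point, which only approximates the lim inf.

```python
def prefix_ratio(coeff, word, n):
    """Weighted average of the first n rewards of the periodic word."""
    num = 0.0
    den = 0.0
    for i in range(n):
        c = coeff(i)
        num += c * word[i % len(word)]
        den += c
    return num / den


word = [4, 0, 2]                # hypothetical cycle of rewards
mean = sum(word) / len(word)    # mean-payoff value of the cycle
approx = prefix_ratio(lambda n: 1 + 1 / (n + 1) ** 2, word, 30000)

# The perturbation 1/(n+1)^2 is summable, so its influence on the
# ratio vanishes as the prefix grows.
print(abs(approx - mean) < 1e-3)  # True
```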

We prove Theorem 4.1 through a sequence of lemmas. In the following lemma we
prove the existence of the limit of the sequence $(1/d_n)_{n \geq 0}$.

Lemma 4.1. If $\liminf_{n \to \infty} \frac{1}{d_n} = 0$, then $\limsup_{n \to \infty} \frac{1}{d_n} = 0$.
Proof. Since $l = \liminf_{n \to \infty} \frac{1}{d_n} = 0$, there is a subsequence $(d_{n_k})$ which either
diverges to $+\infty$ or $-\infty$.

1. If the subsequence $(d_{n_k})$ diverges to $+\infty$, assume without loss of generality that
each $d_{n_k} > 0$. Consider the one-player game graph $G_3$ shown in Figure 1. We consider
the run corresponding to taking the edge with weight $-1$ for the first $n_k$ steps followed
by taking the edge with weight 0 forever. The payoff for this run is given by

$$\liminf_{n \to \infty} \frac{-d_{n_k}}{d_n} = -d_{n_k} \cdot \limsup_{n \to \infty} \frac{1}{d_n} = -d_{n_k} \cdot L.$$

Since we assume existence of memoryless optimal strategies this payoff should lie
between $-1$ and 0. This implies that $d_{n_k} \cdot L \leq 1$ for all $k$. Since $L \geq l \geq 0$ and the
sequence $d_{n_k}$ is unbounded, we must have $L = 0$.

2. If the subsequence $(d_{n_k})$ diverges to $-\infty$, assume that each $d_{n_k} < 0$. Consider
the one-player game graph $G_1$ shown in Figure 1. We consider the run corresponding
to taking the edge with weight 1 for the first $n_k$ steps followed by taking the edge
with weight 0 forever. The payoff for this run is given by

$$\liminf_{n \to \infty} \frac{d_{n_k}}{d_n} = -|d_{n_k}| \cdot \limsup_{n \to \infty} \frac{1}{d_n} = -|d_{n_k}| \cdot L.$$

This payoff should lie between 0 and 1 (optimal strategies being memoryless), and this
implies $L = 0$ as above. □

Since $\limsup_{n \to \infty} d_n = \infty$, Lemma 4.1 concludes that the sequence $(1/d_n)$
converges to 0, i.e. $\lim_{n \to \infty} \frac{1}{d_n} = 0$. It also gives us the following corollaries which are a
simple consequence of the fact that $\liminf_{n \to \infty} (a_n + b_n) = a + \liminf_{n \to \infty} b_n$ if $a_n$
converges to $a$.

Corollary 4.1. If $l = 0$, then the payoff function $\phi$ does not depend upon any finite
prefix of the run, i.e., $\phi(a_1 a_2 \ldots a_k u) = \phi(0^k u) = \phi(b_1 b_2 \ldots b_k u)$ for all $a_i$'s and $b_i$'s.

Corollary 4.2. If $l = 0$, then the payoff function $\phi$ does not change by modifying finitely
many values in the sequence $(c_n)_{n \geq 0}$.

By Corollary 4.1, we have $\phi(x a^\omega) = a$ for all $a \in \mathbb{R}$. For $0 \leq i \leq k - 1$, consider
the payoff $S_{k,i} = \phi((0^i 1 0^{k-i-1})^\omega)$ for the infinite repetition of the finite sequence of
$k$ rewards in which all rewards are 0 except the $(i+1)$th which is 1. We show that $S_{k,i}$
is independent of $i$.

Lemma 4.2. We have $S_{k,0} = S_{k,1} = \cdots = S_{k,k-1} \leq \frac{1}{k}$.

Proof. If $S_{k,0} \leq S_{k,1}$ then by prefixing by the single letter word 0 and using Lemma 2.1
we conclude that $S_{k,1} \leq S_{k,2}$. We continue this process until we get $S_{k,k-2} \leq S_{k,k-1}$.
After applying this step again we get

$$S_{k,k-1} \leq \phi(0 (0^{k-1} 1)^\omega) = \phi(1 (0^{k-1} 1)^\omega) = \phi((1 0^{k-1})^\omega) = S_{k,0}.$$

Hence, we have $S_{k,0} \leq S_{k,1} \leq \cdots \leq S_{k,k-1} \leq S_{k,0}$. Thus $S_{k,i}$ is a constant
irrespective of the value of $i$. A similar argument works in the other case when $S_{k,0} \geq
S_{k,1}$.
Fig. 2. The game $G(k, i)$.



Using the fact that $\liminf_{n \to \infty} (a_{1,n} + a_{2,n} + \cdots + a_{k,n}) \geq \liminf_{n \to \infty} a_{1,n} +
\cdots + \liminf_{n \to \infty} a_{k,n}$, we get that $S_{k,i} \leq \frac{1}{k}$ for $0 \leq i \leq k - 1$. □
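Lemma 4.2 gives a quick test for bounded sequences. The sketch below estimates the payoffs of the words $(10)^\omega$ and $(01)^\omega$, that is $S_{2,0}$ and $S_{2,1}$, for a hypothetical periodic coefficient sequence 1, 2, 1, 2, ... and for the constant sequence. The function `liminf_estimate` and its horizon parameters are assumptions of this example: it takes the minimum of the prefix ratios over a finite tail, which only approximates the lim inf.

```python
from fractions import Fraction


def liminf_estimate(coeff, word, n_max=2000, tail=1000):
    """Minimum prefix ratio over the last `tail` prefix lengths up to n_max."""
    num = Fraction(0)
    den = Fraction(0)
    best = None
    for i in range(n_max):
        c = Fraction(coeff(i))
        num += c * word[i % len(word)]
        den += c
        if i + 1 > n_max - tail:
            ratio = num / den
            best = ratio if best is None else min(best, ratio)
    return best


def alternating(i):
    """Hypothetical bounded sequence 1, 2, 1, 2, ..."""
    return 1 if i % 2 == 0 else 2


def constant(i):
    """Mean-payoff coefficients."""
    return 1


s0_alt = liminf_estimate(alternating, [1, 0])
s1_alt = liminf_estimate(alternating, [0, 1])
s0_const = liminf_estimate(constant, [1, 0])
s1_const = liminf_estimate(constant, [0, 1])

print(float(s0_alt), float(s1_alt))      # close to 1/3 and 2/3
print(float(s0_const), float(s1_const))  # both close to 1/2
```

For the alternating sequence the two estimates differ, which contradicts the conclusion of Lemma 4.2, so this sequence cannot induce memoryless optimal strategies in all game graphs. For the constant sequence both estimates approach $1/2$.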

Let $T_{k,i} = -\phi((0^i (-1) 0^{k-i-1})^\omega)$. By a similar argument as in the proof of
Lemma 4.2, we show that $T_{k,0} = T_{k,1} = \cdots = T_{k,k-1} \geq \frac{1}{k}$.

We now show that $(d_n)$ must eventually always have the same sign, i.e., there exists
$n_0$ such that $\mathrm{sign}(d_m) = \mathrm{sign}(d_n)$ for all $m, n \geq n_0$. Note that by the assumption of
non-zero partial sums, we have $d_n \neq 0$ for all $n$.

Lemma 4.3. The $d_n$'s eventually have the same sign.

Proof. Let $c > 0$ be such that $|c_n| \leq c$ for all $n$. Since $(d_n)$ is unbounded, there
exists $n_0$ such that $|d_n| > c$ for all $n \geq n_0$ and then if there exists $m \geq n_0$ such
that $d_m > 0$ and $d_{m+1} < 0$, we must have $d_m > c$ and $d_{m+1} < -c$. Thus we have
$c_m = d_{m+1} - d_m < -2c$, and hence $|c_m| > 2c$ which contradicts the boundedness
assumption on $(c_n)$. The case $d_m < 0$ and $d_{m+1} > 0$ is symmetric. □

If the $d_n$'s are eventually negative then we use the sequence $(c'_n = -c_n)$ to
obtain the same payoff and in this case $d'_n = -\sum_{i=0}^{n-1} c_i$ will be eventually positive.
Therefore we assume that there is some $n_0$ such that $d_n > 0$ for all $n \geq n_0$. Let
$\beta = \max\{|c_0|, |c_1|, \ldots, |c_{n_0}|\}$. We replace $c_0$ by 1 and all $c_i$'s with $\beta$ for $1 \leq i \leq n_0$.
By Corollary 4.2 we observe that the payoff function will still not change. Hence, we
can also assume that $d_n > 0$ for all $n > 0$.

Lemma 4.4. We have $S_{k,i} = \frac{1}{k} = T_{k,i}$ for all $0 \leq i \leq k - 1$.

Proof. Consider the game graph $G(k, i)$ which consists of a state $q_0$ in which the player
can choose among $k$ cycles of length $k$ where in the $i$th cycle, all rewards are 0 except
on the $(i+1)$th edge which has reward 1 (see Fig. 2).

Consider the strategy in state $q_0$ where the player after every $k \cdot r$ steps ($r \geq 0$)
chooses the cycle which maximizes the contribution for the next $k$ edges. Let $i_r$ be
the index such that $kr \leq i_r \leq kr + k - 1$ and $c_{i_r} = \max\{c_{kr}, \ldots, c_{kr+k-1}\}$ for
$r \geq 0$. The payoff for this strategy is $\liminf_{n \to \infty} t_n$ where $t_n = \frac{c_{i_0} + c_{i_1} + \cdots + c_{i_{r-1}}}{d_n}$
for $i_{r-1} < n \leq i_r$.

Note that $c_{i_r} \geq \frac{1}{k} \sum_{i=kr}^{kr+k-1} c_i$ (the maximum is greater than the average), and we get
the following (where $c$ is a bound on $(|c_n|)_{n \geq 0}$):

$$t_n \geq \frac{1}{k} \cdot \frac{\sum_{i=0}^{n-1} c_i - k c}{d_n} = \frac{1}{k} - \frac{c}{d_n},
\quad \text{hence} \quad \liminf_{n \to \infty} t_n \geq \frac{1}{k} - \lim_{n \to \infty} \frac{c}{d_n} = \frac{1}{k}.$$

By Lemma 4.2, the payoff of all memoryless strategies in $G(k, i)$ is $S_{k,0}$, and the fact
that memoryless optimal strategies exist entails that $S_{k,0} \geq \liminf_{n \to \infty} t_n \geq \frac{1}{k}$, and
thus $S_{k,0} = \frac{1}{k} = S_{k,i}$ for all $0 \leq i \leq k - 1$.

Using a similar argument on the graph $G(k, i)$ with reward $-1$ instead of 1, we
obtain $T_{k,0} = \frac{1}{k} = T_{k,i}$ for all $0 \leq i \leq k - 1$. □

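From Lemma 4.4, it follows that the lim inf and the lim sup of the weight fraction
collected at the positions congruent to $i$ modulo $k$ coincide (since $S_{k,i} = T_{k,i}$), so that

$$S_{k,i} = \phi((0^i 1 0^{k-i-1})^\omega) = \lim_{n \to \infty} \frac{\sum_{0 \leq kj+i < n} c_{kj+i}}{d_n} = \frac{1}{k}$$

and hence,

$$\phi((a_0 a_1 \ldots a_{k-1})^\omega)
= \liminf_{n \to \infty} \sum_{i=0}^{k-1} a_i \cdot \frac{\sum_{0 \leq kj+i < n} c_{kj+i}}{d_n}
= \sum_{i=0}^{k-1} a_i \cdot \lim_{n \to \infty} \frac{\sum_{0 \leq kj+i < n} c_{kj+i}}{d_n}
= \frac{\sum_{i=0}^{k-1} a_i}{k}.$$
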

We show that the payoff of a regular word $u = b_1 b_2 \ldots b_m (a_0 a_1 \ldots a_{k-1})^\omega$
matches the mean-payoff value.

Lemma 4.5. If $u := b_1 b_2 \ldots b_m (a_0 a_1 \ldots a_{k-1})^\omega$ and $v = (a_0 a_1 \ldots a_{k-1})^\omega$ are two
regular sequences of weights then $\phi(u) = \phi(v) = \frac{\sum_{i=0}^{k-1} a_i}{k}$.

Proof. Let $r \in \mathbb{N}$ be such that $kr > m$. If $\phi(v) \leq \phi(0v)$ then using Lemma 2.1
we obtain $\phi(0v) \leq \phi(0^2 v)$. Applying the lemma again and again, we get $\phi(v) \leq
\phi(0^m v) \leq \phi(0^{kr} v)$. From Corollary 4.1 we obtain $\phi(0^m v) = \phi(b_1 b_2 \ldots b_m v) = \phi(u)$
and $\phi(0^{kr} v) = \phi((a_0 a_1 \ldots a_{k-1})^r v) = \phi(v)$. Therefore, $\phi(u) = \phi(v) = \frac{\sum_{i=0}^{k-1} a_i}{k}$. The
same argument goes through for the case $\phi(v) \geq \phi(0v)$. □

Proof (of Theorem 4.1). In Lemma 4.5 we have shown that the payoff function must
match the mean-payoff function for regular words, if the sequence $(c_n)_{n \geq 0}$ is bounded.
Since memoryless strategies in game graphs result in regular words over weights, it
follows that the only payoff function that induces memoryless optimal strategies is the
mean-payoff function, which concludes the proof. □

Observe that every regular sequence is bounded, and therefore the result of
Theorem 4.1 holds for all weighted average objectives with divergent sum defined by a regular
sequence of weights.

Corollary 4.3. Let $(c_n)_{n \in \mathbb{N}}$ be a regular sequence of real numbers with no zero partial
sum such that $\sum_{i=0}^{\infty} |c_i| = \infty$ (the sum is divergent). The weighted average payoff
function $\phi$ defined by $(c_n)_{n \in \mathbb{N}}$ induces optimal memoryless strategies for all two-player
game graphs if and only if $\phi$ is the mean-payoff function.
4.2 Unbounded sequence

The results of Section 3 and Section 4.1 can be summarized as follows: (1) if the sum
of the $c_i$'s is convergent, then the sequences $(\lambda^i)_{i \geq 0}$, with $\lambda < 1$ (discounted sum), form the
only class of payoff functions that induce memoryless optimal strategies; and (2) if the
sum is divergent but the sequence $(c_n)$ is bounded, then the mean-payoff function is the
only payoff function with memoryless optimal strategies (and the mean-payoff function
is defined by the sequence $(\lambda^i)_{i \geq 0}$, with $\lambda = 1$). The remaining natural question is
whether, if the sum is divergent and the sequence is unbounded, the sequences $(\lambda^i)_{i \geq 0}$, with $\lambda > 1$, form the
only class that has memoryless optimal strategies. Below we show with an example that
the class $(\lambda^i)_{i \geq 0}$, with $\lambda > 1$, need not have memoryless optimal strategies.

We consider the payoff function given by the sequence $c_n = 2^n$. It is easy to
verify that the sequence satisfies the no zero partial sum assumption. We show that the
payoff function does not induce memoryless optimal strategies. To see this, we
observe that the payoff for a regular word $w = b_0 b_1 \ldots b_t (a_0 a_1 \ldots a_{k-1})^\omega$ is given by

$$\min_{0 \leq i \leq k-1} \frac{a_i + 2 a_{i+1} + \cdots + 2^{k-1} a_{i+k-1}}{1 + 2 + \cdots + 2^{k-1}},$$

i.e., the payoff for a regular word is the least
possible weighted average payoff for its cycle considering all possible cyclic
permutations of its indices (note that the addition in indices is performed modulo $k$).
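The cyclic formula can be compared with the defining prefix ratios for $c_n = 2^n$. In the sketch below, `cyclic_payoff` implements the minimum over cyclic rotations and `prefix_min` takes the smallest prefix ratio over a finite tail of prefix lengths. Both function names, the horizon parameters and the example word are assumptions of this illustration, and the tail minimum is only a finite stand-in for the lim inf.

```python
from fractions import Fraction


def cyclic_payoff(cycle):
    """min over i of (a_i + 2 a_{i+1} + ... + 2^(k-1) a_{i+k-1}) / (2^k - 1)."""
    k = len(cycle)
    total = 2 ** k - 1  # equals 1 + 2 + ... + 2^(k-1)
    return min(
        Fraction(sum(2 ** j * cycle[(i + j) % k] for j in range(k)), total)
        for i in range(k)
    )


def prefix_min(prefix, cycle, n_max=200, tail=50):
    """Smallest prefix ratio with c_n = 2^n over the last `tail` lengths."""
    def reward(i):
        if i < len(prefix):
            return prefix[i]
        return cycle[(i - len(prefix)) % len(cycle)]

    num = 0
    best = None
    for i in range(n_max):
        num += 2 ** i * reward(i)
        if i + 1 > n_max - tail:
            ratio = Fraction(num, 2 ** (i + 1) - 1)
            best = ratio if best is None else min(best, ratio)
    return best


exact = cyclic_payoff([1, 0])
approx = prefix_min([5], [1, 0])
print(exact, float(approx))
```

The finite prefix 5 has no visible effect on the estimate, because under $c_n = 2^n$ the most recent rewards dominate the weighted average.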




Fig. 3. The game $G_{1024}$.

Now, consider the game graph $G_{1024}$ shown in Figure 3. The payoff of each of
the two memoryless strategies (choosing the left or the right edge in the start state) is the
minimum of the two cyclic weighted averages of its cycle, and both payoffs are equal. However,
if we consider the strategy which alternates between the two edges in the starting state, then
the payoff obtained is the minimum of the four cyclic weighted averages of the combined cycle,
which is less than the payoff for both the memoryless
strategies. Hence, the player who minimizes the payoff does not have a memoryless
optimal strategy in the game $G_{1024}$. The example establishes that the sequence $(2^n)_{n \geq 0}$
does not induce memoryless optimal strategies.
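The same phenomenon can be reproduced on a smaller, hypothetical game that is not the game of Fig. 3: a start state with two cycles of length 2, a left cycle with rewards 1, 0 and a right cycle with rewards 0, 1. The rewards and the helper `cyclic_payoff` are assumptions of this sketch, which evaluates the cyclic formula for $c_n = 2^n$ on the two memoryless plays and on the alternating play.

```python
from fractions import Fraction


def cyclic_payoff(cycle):
    """Payoff of cycle^omega under c_n = 2^n: min over cyclic rotations."""
    k = len(cycle)
    total = 2 ** k - 1
    return min(
        Fraction(sum(2 ** j * cycle[(i + j) % k] for j in range(k)), total)
        for i in range(k)
    )


left = [1, 0]    # hypothetical rewards on the left cycle
right = [0, 1]   # hypothetical rewards on the right cycle

memoryless = [cyclic_payoff(left), cyclic_payoff(right)]
alternating = cyclic_payoff(left + right)  # left cycle, then right cycle, repeated

print(memoryless, alternating)  # [Fraction(1, 3), Fraction(1, 3)] 1/5
```

In this toy game the alternating strategy yields a strictly smaller payoff than both memoryless strategies, so a minimizing player would need memory to play optimally.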

Open question. Though weighted average objectives such that the sum is divergent
and the sequence is unbounded may not be of the greatest practical relevance, it is an interesting
theoretical question to characterize the subclass that induces memoryless optimal strategies. Our
counter-example shows that $(\lambda^n)_{n \geq 0}$ with $\lambda > 1$ is not in this subclass.